compute resource
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Information Technology (0.93)
- Health & Medicine (0.68)
- Information Technology > Software (1.00)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Maryland > Baltimore County (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
DynamiX: Dynamic Resource eXploration for Personalized Ad-Recommendations
Roychowdhury, Sohini, Holeman, Adam, Amin, Mohammad, Wei, Feng, Mehta, Bhaskar, Reddy, Srihari
For online ad-recommendation systems, processing complete user-ad-engagement histories is both computationally intensive and noise-prone. We introduce Dynamix, a scalable, personalized sequence exploration framework that optimizes event history processing using maximum relevance principles and self-supervised learning through Event Based Features (EBFs). Dynamix categorizes user engagements at the session and surface levels by leveraging correlations between dwell times and ad-conversion events. This enables targeted, event-level feature removal and selective feature boosting for certain user segments, thereby yielding training and inference efficiency wins without sacrificing engaging-ad prediction accuracy. While dynamic resource removal increases training and inference throughput by 1.15% and 1.8%, respectively, dynamic feature boosting provides 0.033 NE gains while boosting inference QPS by 4.2% over baseline models. These results demonstrate that Dynamix achieves significant cost efficiency and performance improvements in online user-sequence-based recommendation models. Self-supervised user segmentation and resource exploration can further boost complex feature selection strategies while optimizing for workflow and compute resources.
- Marketing (0.85)
- Information Technology > Services (0.51)
- Asia > Taiwan (0.04)
- North America > United States > Pennsylvania (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench
Toledo, Edan, Hambardzumyan, Karen, Josifoski, Martin, Hazra, Rishi, Baldwin, Nicolas, Audran-Reiss, Alexis, Kuchnik, Michael, Magka, Despoina, Jiang, Minqi, Lupidi, Alisia Maria, Lupu, Andrei, Raileanu, Roberta, Niu, Kelvin, Shavrina, Tatiana, Gagnon-Audet, Jean-Christophe, Shvartsman, Michael, Sodhani, Shagun, Miller, Alexander H., Charnalia, Abhishek, Dunfield, Derek, Wu, Carole-Jean, Stenetorp, Pontus, Cancedda, Nicola, Foerster, Jakob Nicolaus, Bachrach, Yoram
AI research agents are demonstrating great potential to accelerate scientific progress by automating the design, implementation, and training of machine learning models. We focus on methods for improving agents' performance on MLE-bench, a challenging benchmark where agents compete in Kaggle competitions to solve real-world machine learning problems. We formalize AI research agents as search policies that navigate a space of candidate solutions, iteratively modifying them using operators. By designing and systematically varying different operator sets and search policies (Greedy, MCTS, Evolutionary), we show that their interplay is critical for achieving high performance. Our best pairing of search strategy and operator set achieves a state-of-the-art result on MLE-bench lite, increasing the success rate of achieving a Kaggle medal from 39.6% to 47.7%. Our investigation underscores the importance of jointly considering the search strategy, operator design, and evaluation methodology in advancing automated machine learning.
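The abstract's framing of an agent as a search policy that repeatedly applies operators to candidate solutions can be illustrated with a small greedy loop. This is only a toy sketch under stated assumptions, not the paper's implementation: the `score` function, the `perturb` and `shrink` operators, and the list-of-floats solution type are all hypothetical stand-ins for real ML pipelines and their validation metrics.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical types: a "solution" is just a list of floats in this toy setting.
Solution = List[float]
Operator = Callable[[Solution, random.Random], Solution]


def score(sol: Solution) -> float:
    """Hypothetical evaluation: negative squared distance to a fixed target."""
    target = [1.0, -2.0, 0.5]
    return -sum((s - t) ** 2 for s, t in zip(sol, target))


def perturb(sol: Solution, rng: random.Random) -> Solution:
    """Hypothetical operator: nudge one randomly chosen coordinate."""
    new = list(sol)
    i = rng.randrange(len(new))
    new[i] += rng.uniform(-0.5, 0.5)
    return new


def shrink(sol: Solution, rng: random.Random) -> Solution:
    """Hypothetical operator: scale every coordinate toward zero."""
    return [0.9 * x for x in sol]


def greedy_search(
    initial: Solution, operators: List[Operator], steps: int, seed: int = 0
) -> Tuple[Solution, float]:
    """Greedy policy: apply every operator to the current best solution and
    keep the best child only if it improves the score."""
    rng = random.Random(seed)
    best, best_score = initial, score(initial)
    for _ in range(steps):
        candidates = [op(best, rng) for op in operators]
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score > best_score:
            best, best_score = top, top_score
    return best, best_score
```

Calling `greedy_search([0.0, 0.0, 0.0], [perturb, shrink], steps=200)` returns the best solution found and its score. Swapping the selection rule (for example, sampling among children instead of always taking the best) would give a different search policy over the same operator set, which is the kind of interplay the abstract describes.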
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Sweden > Örebro County > Örebro (0.04)
- Asia > Middle East > Jordan (0.04)
- Education (1.00)
- Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Asia > Taiwan (0.04)
- North America > United States > Pennsylvania (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Information Technology (0.93)
- Health & Medicine (0.68)
A Compute resources used
Table 3 shows the full results with unlikelihood training and length normalization. [Table residue: columns COPA, H-Swag, StoryCloze, Winogrande, WSC, WiC; row FT, 78.0 ...] ... PEFT methods we considered and ablate the losses. We use "Question:" and "Answer:" as ... Since T0 is unable to perform ICL on its own, we also compare to T5+LM, the next-step-prediction language model upon which T0 is based. Due to memory constraints and because of its improved performance, we use ensemble ICL for ... Table 10 shows the T-Few ablation results. Per-dataset results of T-Few and the other top-5 methods on RAFT are shown in Table 11. [Table residue: columns # of Param, COPA, H-Swag, StoryCloze, Winogrande; row Full Model Fine-tuning, 3B, 81.0 ...]
Classification of kinetic-related injury in hospital triage data using NLP
Shyam, Midhun, Basilakis, Jim, Luken, Kieran, Thomas, Steven, Crozier, John, Middleton, Paul M., Wang, X. Rosalind
Triage notes, created at the start of a patient's hospital visit, contain a wealth of information that can help medical staff and researchers understand Emergency Department patient epidemiology and the degree of time-dependent illness or injury. Unfortunately, applying modern Natural Language Processing and Machine Learning techniques to analyse triage data faces several challenges. First, hospital data contains highly sensitive information that is subject to privacy regulation and thus needs to be analysed on site. Second, most hospitals and medical facilities lack the necessary hardware to fine-tune a Large Language Model (LLM), much less train one from scratch. Lastly, identifying the records of interest requires experts to manually label the datasets, which can be time-consuming and costly. We present in this paper a pipeline that enables the classification of triage data using an LLM and limited compute resources. We first fine-tuned a pre-trained LLM with a classifier using a small (2k) open-source dataset on a GPU, and then further fine-tuned the model with a hospital-specific dataset of 1000 samples on a CPU. We demonstrated that by carefully curating the datasets and leveraging existing models and open-source data, we can successfully classify triage data with limited compute resources.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Maryland > Baltimore County (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)